9 research outputs found
Fast Multi-frame Stereo Scene Flow with Motion Segmentation
We propose a new multi-frame method for efficiently computing scene flow
(dense depth and optical flow) and camera ego-motion for a dynamic scene
observed from a moving stereo camera rig. Our technique also segments out
moving objects from the rigid scene. In our method, we first estimate the
disparity map and the 6-DOF camera motion using stereo matching and visual
odometry. We then identify regions inconsistent with the estimated camera
motion and compute per-pixel optical flow only at these regions. This flow
proposal is fused with the camera motion-based flow proposal using fusion moves
to obtain the final optical flow and motion segmentation. This unified
framework benefits all four tasks (stereo, optical flow, visual odometry, and
motion segmentation), leading to overall higher accuracy and efficiency. Our
method is currently ranked third on the KITTI 2015 scene flow benchmark.
Furthermore, our CPU implementation runs in 2-3 seconds per frame, which is 1-3
orders of magnitude faster than the top six methods. We also report a thorough
evaluation on challenging Sintel sequences with fast camera and object motion,
where our method consistently outperforms OSF [Menze and Geiger, 2015], which
is currently ranked second on the KITTI benchmark.
Comment: 15 pages. To appear at IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2017). Our results were submitted to KITTI 2015 Stereo
Scene Flow Benchmark in November 201
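
The idea of a "camera motion-based flow proposal" in the abstract above can be illustrated with a minimal sketch. It assumes a rectified pinhole stereo rig and is not the authors' implementation: the function names, the parameters (f, cx, cy, baseline), and the pixel threshold are hypothetical, and the consistency test shown here (comparing against an observed flow vector) is only one possible criterion.

```python
import math

def rigid_flow(x, y, disparity, f, cx, cy, baseline, R, t):
    """Flow at pixel (x, y) induced by camera motion alone.

    Assumes a rectified pinhole stereo rig (hypothetical parameters):
    f = focal length in pixels, (cx, cy) = principal point,
    baseline = stereo baseline, R (3x3 nested lists) and t (length 3)
    map points from the current camera frame to the next one.
    """
    # Back-project the pixel to 3D using depth from disparity.
    Z = f * baseline / disparity
    X = (x - cx) * Z / f
    Y = (y - cy) * Z / f
    # Apply the rigid motion: P' = R P + t.
    Xn = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
    Yn = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
    Zn = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
    # Re-project into the next frame and take the displacement.
    xn = f * Xn / Zn + cx
    yn = f * Yn / Zn + cy
    return xn - x, yn - y

def is_inconsistent(rigid, observed, threshold=3.0):
    """Flag a pixel whose observed flow deviates from the rigid flow.

    The threshold (in pixels) is an illustrative assumption.
    """
    return math.hypot(rigid[0] - observed[0], rigid[1] - observed[1]) > threshold
```

Pixels flagged this way would be candidates for moving objects, which is where the abstract says per-pixel optical flow is computed.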
Semi-Global Stereo Matching with Surface Orientation Priors
Semi-Global Matching (SGM) is a widely-used efficient stereo matching
technique. It works well for textured scenes, but fails on untextured slanted
surfaces due to its fronto-parallel smoothness assumption. To remedy this
problem, we propose a simple extension, termed SGM-P, to utilize precomputed
surface orientation priors. Such priors favor different surface slants in
different 2D image regions or 3D scene regions and can be derived in various
ways. In this paper we evaluate plane orientation priors derived from stereo
matching at a coarser resolution and show that such priors can yield
significant performance gains for difficult weakly-textured scenes. We also
explore surface normal priors derived from Manhattan-world assumptions, and we
analyze the potential performance gains using oracle priors derived from
ground-truth data. SGM-P only adds a minor computational overhead to SGM and is
an attractive alternative to more complex methods employing higher-order
smoothness terms.
Comment: extended draft of 3DV 2017 (spotlight) paper
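
The fronto-parallel smoothness assumption mentioned in the abstract above can be seen in the standard SGM cost aggregation along a single path. The sketch below is an illustration of plain SGM, not of SGM-P. The function name and the penalty values p1 and p2 are assumptions chosen for the example.

```python
def aggregate_path(costs, p1=1.0, p2=4.0):
    """Aggregate matching costs along one 1D path (standard SGM recurrence).

    costs[i][d] is the matching cost of pixel i on the path at disparity d.
    Keeping the same disparity as the previous pixel is free, a change of
    one level costs p1, and any larger jump costs p2.
    """
    n_disp = len(costs[0])
    prev = list(costs[0])
    out = [prev]
    for i in range(1, len(costs)):
        prev_min = min(prev)
        cur = []
        for d in range(n_disp):
            best = prev[d]  # same disparity: no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)
            if d < n_disp - 1:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)  # arbitrary jump
            # Subtract prev_min so values stay bounded along the path.
            cur.append(costs[i][d] + best - prev_min)
        out.append(cur)
        prev = cur
    return out
```

Because only an unchanged disparity is penalty-free, a slanted surface whose disparity changes steadily along the path pays p1 repeatedly, which is the bias that orientation priors are meant to counteract.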
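Discrete Optimization Approaches to Image Segmentation and Correspondence Estimation Problems (画像領域分割と対応点推定問題への離散最適化アプローチ)
Degree type: Doctorate (by coursework). Examination committee members: (Chair) 相澤 清晴, Professor, University of Tokyo; 佐藤 洋一, Professor, University of Tokyo; 佐藤 真一, Professor, National Institute of Informatics; 苗村 健, Professor, University of Tokyo; 山崎 俊彦, Associate Professor, University of Tokyo; 石川 博, Professor, Waseda University. University of Tokyo (東京大学)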
Neural Structure Fields with Application to Crystal Structure Autoencoders
Representing crystal structures of materials to facilitate determining them
via neural networks is crucial for enabling machine-learning applications
involving crystal structure estimation. Among these applications, the inverse
design of materials can contribute to next-generation methods that explore
materials with desired properties without relying on luck or serendipity. We
propose neural structure fields (NeSF) as an accurate and practical approach
for representing crystal structures using neural networks. Inspired by the
concepts of vector fields in physics and implicit neural representations in
computer vision, the proposed NeSF considers a crystal structure as a
continuous field rather than as a discrete set of atoms. Unlike existing
grid-based discretized spatial representations, the NeSF overcomes the tradeoff
between spatial resolution and computational complexity and can represent any
crystal structure. To evaluate the NeSF, we propose an autoencoder of crystal
structures that can recover various crystal structures, such as those of
perovskite structure materials and cuprate superconductors. Extensive
quantitative results demonstrate the superior performance of the NeSF compared
with the existing grid-based approach.
Comment: 16 pages, 6 figures. 13 pages Supplementary Information